Draft
- Enable #![feature(portable_simd)] and #![feature(specialization)] (Nightly).
- Implement DeinterleaveSupported trait to provide specialized SIMD paths.
- Add SIMD-accelerated deinterleaving for f32, u8, i8, and i16 using an 8-lane configuration.
- Introduce deinterleave_scalar_logic for the base case and tail processing.
- Use a macro to reduce duplication across SIMD-supported types.
- Add comprehensive unit tests covering odd sample counts and multiple types.
- Add a Criterion benchmark to track deinterleaving performance (reaching ~1 Gelem/s).
- Update documentation in AGENTS.md and agent_docs/ to reflect the new performance pattern.
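The block-plus-tail structure described in this commit can be sketched on stable Rust as follows. This is a minimal illustration, not the PR's code: it assumes two interleaved channels, uses plain loops over 8-element blocks instead of std::simd, and silently ignores a trailing unpaired sample. The name deinterleave_scalar_logic comes from the commit message, but its signature here, along with deinterleave and LANES, is hypothetical.

```rust
/// Frames handled per block; mirrors the 8-lane configuration (assumed value).
const LANES: usize = 8;

/// Scalar base case, also used for the tail.
/// Hypothetical signature: two interleaved channels split into `a` and `b`.
fn deinterleave_scalar_logic<T: Copy>(input: &[T], a: &mut [T], b: &mut [T]) {
    for ((pair, x), y) in input.chunks_exact(2).zip(a.iter_mut()).zip(b.iter_mut()) {
        *x = pair[0];
        *y = pair[1];
    }
}

/// Splits `input` into `a` and `b`, returning the number of frames written.
/// A trailing unpaired sample is ignored (assumption of this sketch).
fn deinterleave<T: Copy>(input: &[T], a: &mut [T], b: &mut [T]) -> usize {
    let frames = (input.len() / 2).min(a.len()).min(b.len());
    let body = frames - frames % LANES;

    // Main body: whole blocks of LANES frames. A SIMD path would replace
    // this inner loop with vector loads and shuffles.
    for ((src, da), db) in input[..2 * body]
        .chunks_exact(2 * LANES)
        .zip(a[..body].chunks_exact_mut(LANES))
        .zip(b[..body].chunks_exact_mut(LANES))
    {
        for i in 0..LANES {
            da[i] = src[2 * i];
            db[i] = src[2 * i + 1];
        }
    }

    // Tail: the frames that do not fill a whole block.
    deinterleave_scalar_logic(
        &input[2 * body..2 * frames],
        &mut a[body..frames],
        &mut b[body..frames],
    );
    frames
}
```

With a 19-element input, one full block of 8 frames goes through the main loop, one frame goes through the tail, and the last sample is left unprocessed.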
…e gate
- Implement SIMD-accelerated scaled conversion for u8 -> f32 (3x speedup).
- Add 'simd' feature gate to Cargo.toml to allow stable Rust compilation.
- Refactor Deinterleave and TypeConverter to use conditional compilation for SIMD.
- Add TypeConvertSupported trait for specialized type conversions.
- Add Criterion benchmark for type converters.
- Update SIMD.md with roadmap and established patterns.
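A rough sketch of how a 'simd' Cargo feature can select between a stable scalar path and a nightly std::simd path for the u8 -> f32 conversion. The function name, the scale parameter, and the lane count are assumptions for illustration, and the std::simd calls reflect the unstable API as it appears on recent nightlies, so they may change. Without the feature enabled, only the scalar variant is compiled.

```rust
// In the crate root (e.g. src/lib.rs) the nightly feature would be enabled
// only when the Cargo feature is on. Shown as a comment because inner
// attributes must come first in the file:
//
//     #![cfg_attr(feature = "simd", feature(portable_simd))]
//
// with a matching `simd = []` entry under `[features]` in Cargo.toml.

/// Scalar path: compiled on stable Rust when the `simd` feature is off.
/// Hypothetical signature; `scale` is an assumed parameter.
#[cfg(not(feature = "simd"))]
fn convert_u8_to_f32_scaled(input: &[u8], output: &mut [f32], scale: f32) {
    for (o, &i) in output.iter_mut().zip(input) {
        *o = f32::from(i) * scale;
    }
}

/// SIMD path: compiled only with `--features simd` on a nightly toolchain.
/// The std::simd API is unstable; treat these calls as an assumption.
#[cfg(feature = "simd")]
fn convert_u8_to_f32_scaled(input: &[u8], output: &mut [f32], scale: f32) {
    use std::simd::prelude::*;
    const LANES: usize = 8;

    let n = input.len().min(output.len());
    let body = n - n % LANES;
    let k = f32x8::splat(scale);

    // Main body: widen 8 bytes to 8 floats, then scale them in one multiply.
    for (src, dst) in input[..body]
        .chunks_exact(LANES)
        .zip(output[..body].chunks_exact_mut(LANES))
    {
        let v: f32x8 = u8x8::from_slice(src).cast::<f32>();
        (v * k).copy_to_slice(dst);
    }

    // Tail: the elements that do not fill a whole vector.
    for (o, &i) in output[body..n].iter_mut().zip(&input[body..n]) {
        *o = f32::from(i) * scale;
    }
}
```

Both variants share one name and signature, so callers do not need their own cfg attributes.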
- Implement FreqShiftSupported trait with specialized SIMD path for Complex32.
- Achieve ~4.4x speedup on complex rotation compared to naive scalar loop.
- Use vectorized complex multiplications with periodic NCO re-sync (every 1024 samples) to maintain numerical precision.
- Add Criterion benchmark for FreqShift.
- Update SIMD.md with performance results.
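The periodic NCO re-sync can be illustrated with a scalar sketch. It is not the vectorized code from this commit: C32 is a minimal local stand-in for Complex32, and freq_shift, rotate, from_phase, and the f64 phase bookkeeping are hypothetical. The oscillator advances by one complex multiply per sample, which accumulates rounding error, and is recomputed from the exact phase every 1024 samples.

```rust
/// Minimal stand-in for Complex32 (assumption: keeps the sketch dependency-free).
#[derive(Clone, Copy, Debug, PartialEq)]
struct C32 {
    re: f32,
    im: f32,
}

impl C32 {
    /// Complex multiplication.
    fn rotate(self, o: C32) -> C32 {
        C32 {
            re: self.re * o.re - self.im * o.im,
            im: self.re * o.im + self.im * o.re,
        }
    }

    /// Unit phasor for `phase` radians, evaluated in f64 before rounding to f32.
    fn from_phase(phase: f64) -> C32 {
        C32 {
            re: phase.cos() as f32,
            im: phase.sin() as f32,
        }
    }
}

/// Re-sync period in samples, as stated in the commit message.
const RESYNC_INTERVAL: usize = 1024;

/// Multiplies sample n by exp(j * phase_step * n), in place.
fn freq_shift(samples: &mut [C32], phase_step: f64) {
    let step = C32::from_phase(phase_step);
    let mut nco = C32::from_phase(0.0);
    for (n, s) in samples.iter_mut().enumerate() {
        if n % RESYNC_INTERVAL == 0 {
            // Re-sync: discard the drifted phasor and recompute it exactly.
            nco = C32::from_phase(phase_step * n as f64);
        }
        *s = (*s).rotate(nco);
        // Cheap per-sample update: one complex multiply instead of sin/cos.
        nco = nco.rotate(step);
    }
}
```

Between re-syncs the error grows with each multiply, so the interval trades precision against the cost of the trigonometric calls.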
- Address clippy::clone_on_copy warning by dereferencing NCO.
- Switch from specialization to min_specialization for better stability.
- Add #[allow(incomplete_features)] to src/lib.rs to satisfy clippy in CI/CD.
When available (behind a feature gate), the implementation leverages core::simd to make explicit how each loop is vectorized.

Optimization of Deinterleave, TypeConverter, and FreqShift using std::simd and specialization, with significant performance gains (up to 4.4x). The implementation sits behind a simd feature gate for stable Rust compatibility, and CI/CD concerns are addressed by switching to min_specialization and allowing incomplete_features.